DOCUMENT RESUME

ED 465 803                                        TM 034 190

AUTHOR          Klein, Davina C. D.; Chung, Gregory K. W. K.; Osmundson,
                Ellen; Herl, Howard E.; O'Neil, Harold F., Jr.
TITLE           Examining the Validity of Knowledge Mapping as a Measure of
                Elementary Students' Scientific Understanding. CSE Technical
                Report.
INSTITUTION     California Univ., Los Angeles. Center for the Study of
                Evaluation.; National Center for Research on Evaluation,
                Standards, and Student Testing, Los Angeles, CA.
SPONS AGENCY    Office of Educational Research and Improvement (ED),
                Washington, DC.
REPORT NO       CSE-TR-557
PUB DATE        2002-04-00
NOTE            45p.
CONTRACT        R305B60002
AVAILABLE FROM  Center for the Study of Evaluation, National Center for
                Research on Evaluation, Standards, and Student Testing,
                Graduate School of Education & Information Studies,
                University of California at Los Angeles, 300 Charles E.
                Young Dr. North, Los Angeles, CA 90095-1522. Tel:
                310-266-1532. For full text:
                http://www.cse.ucla.edu/CRESST/Reports/TECH559.PDF.
PUB TYPE        Reports - Research (143)
EDRS PRICE      MF01/PC02 Plus Postage.
DESCRIPTORS     Comprehension; *Elementary School Students; Intermediate
                Grades; Multitrait Multimethod Techniques; *Reliability;
                *Scientific Principles; Scoring; Student Evaluation;
                *Validity
IDENTIFIERS     *Knowledge Maps



ABSTRACT 



Knowledge mapping is expected to measure deep conceptual
understanding and allow students to characterize relationships among concepts
in a domain visually. This research examined the validity of knowledge
mapping as an assessment tool in science. The approach to investigating this
validity was three-pronged. First, a model was outlined for the creation of
knowledge mapping tasks, proposing a standard set of steps and using content
area and educational experts to ensure the content validity of the measures.
Next, a scoring method was developed to evaluate student performance. This
report contains a discussion of the method's reliability and its relation to
other possible scoring systems. Finally, the report presents the statistical
results based on the participation of 56 fourth and fifth graders, including
comparative analyses, the multitrait-multimethod (MTMM) validity analyses
involving two traits (students' understanding of hearing and of vision) and
three different measurement methods (knowledge mapping, essay, and
multiple-choice tasks), critical proposition analyses, and analyses of
students' propositional elaborations. Results show knowledge maps to be
sensitive to students' competency level, with mixed MTMM results. The report
concludes with a discussion of implications and directions for future work.
(Contains 7 tables, 2 figures, and 33 references.) (Author/SLD)



Reproductions supplied by EDRS are the best that can be made 
from the original document. 




U.S. DEPARTMENT OF EDUCATION 
Office of Educational Research and Improvement 
EDUCATIONAL RESOURCES INFORMATION 
CENTER (ERIC)

☒ This document has been reproduced as
received from the person or organization 
originating it. 

□ Minor changes have been made to 
improve reproduction quality. 

• Points of view or opinions stated in this 
document do not necessarily represent 
official OERI position or policy. 



PERMISSION TO REPRODUCE AND 
DISSEMINATE THIS MATERIAL HAS 
BEEN GRANTED BY 






TO THE EDUCATIONAL RESOURCES 
INFORMATION CENTER (ERIC) 








Examining the Validity of Knowledge Mapping 
as a Measure of Elementary Students' 
Scientific Understanding 

CSE Technical Report No. 557 

Davina C. D. Klein, Gregory K. W. K. Chung,
Ellen Osmundson, and Howard E. Herl 
CRESST/University of California, Los Angeles 

Harold F. O'Neil, Jr. 

University of Southern California/CRESST


April 2002 



Center for the Study of Evaluation 
National Center for Research on Evaluation, 
Standards, and Student Testing 
Graduate School of Education & Information Studies 
University of California, Los Angeles 
Los Angeles, CA 90095-1522 
(310) 206-1532 







Project 1.3. Technology in Action 

Eva Baker, CRESST/University of California, Los Angeles, Project Director
Copyright © 2002 The Regents of the University of California 

The work reported herein was supported under the Educational Research and Development Centers 
Program, PR/Award Number R305B60002, as administered by the Office of Educational Research 
and Improvement, U.S. Department of Education.

The findings and opinions expressed in this report do not reflect the positions or policies of the 
National Institute on Student Achievement, Curriculum, and Assessment, the Office of Educational 
Research and Improvement, or the U.S. Department of Education.






THE VALIDITY OF KNOWLEDGE MAPPING

Abstract

Although first popular as an instructional tool in the classroom, knowledge mapping has 
been used increasingly as an assessment tool. Knowledge mapping is expected to 
measure deep conceptual understanding and allow students to characterize relationships 
among concepts in a domain visually. Our research examines the validity of knowledge 
mapping as an assessment tool in science. Our approach to investigating this validity is 
three-pronged. First, we outline a model for the creation of knowledge mapping tasks, 
proposing a standard set of steps and using content area and educational experts to 
ensure the content validity of the measures. Next, we describe a scoring method used to 
evaluate student performance, including a discussion of the method's reliability and its 
relationship to other possible scoring systems. Finally, we present our statistical results 
including comparative analyses, our multitrait-multimethod validity analyses involving 
two traits (students' understanding of hearing and of vision) and three different 
measurement methods (knowledge mapping, essay, and multiple-choice tasks), critical 
proposition analyses, and analyses of students' propositional elaborations. Results show 
knowledge maps to be sensitive to students' competency level, with mixed MTMM 
results. We conclude with a discussion of implications and directions for future work. 

As evidence mounts regarding the limitations of standardized multiple-choice 
testing, educators increasingly are looking for ways of assessing students' scientific 
conceptual understanding that may be missed by more traditional measures. 
Students' performance on knowledge mapping tasks has emerged as one possible
source of information regarding their scientific knowledge. Knowledge mapping³
allows students to represent their understanding graphically, using nodes to
represent main ideas and links to represent the relationships between the ideas.
Students construct knowledge maps to demonstrate their knowledge of the
important relationships among main ideas within a domain (see Figure 1 for a
sample knowledge map).

1 We wish to acknowledge and thank Joanne Michiuye and Ali Abedi of CRESST/UCLA for their
invaluable technical support, Robby Klein and Andrew Shpall for their content expertise, and Uyen
Bui for her interest, participation, and content expertise. Finally, we are deeply grateful to Sharon
Sutton, Jan Cohn, and their students for their assistance and participation in this research.

2 Howard Herl is now at the Los Angeles County Office of Education.

Knowledge mapping has been used extensively in instructional situations to 
facilitate understanding of subject matter, to allow summarization of important 
information, to aid in recall in review situations, and to characterize the structure of 
text (Heinze-Fry & Novak, 1990; Horton et al., 1993; Jonassen, Beissner, & Yacci, 
1993; Novak & Gowin, 1984). 




Figure 1. Sample knowledge map. 



3 We use the term knowledge mapping rather than concept mapping because we believe it to be a broader 
term, encompassing both conceptual knowledge and other types of knowledge (e.g., procedural 
knowledge). However, these tasks certainly could be characterized as concept-mapping tasks.



Research indicates that students who use knowledge maps are better at 
integrating, organizing, comprehending, retaining, and recalling new material 
(Armbruster & Anderson, 1984; Holley & Dansereau, 1984; Jonassen et al., 1993;
Okebukola & Jegede, 1988). As instruction and assessment intersect, the use of 
knowledge mapping has moved into the assessment arena (Baker, Niemi, Novak, & 
Herl, 1992; Chung, O'Neil, & Herl, 1999; Herl, 1995; Herl, Baker, & Niemi, 1996; 
Jonassen et al., 1993; McClure, Sonak, & Suen, 1999; Novak, 1998; Ruiz-Primo &
Shavelson, 1995; Ruiz-Primo, Schultz, & Shavelson, 1997). Researchers have
suggested that as an assessment task, knowledge mapping can elicit students' deep 
conceptual understanding. Like an essay task, a knowledge map allows students to 
demonstrate their understanding of relationships between complex concepts. 
However, unlike essays, which are usually scored by human raters and often at 
considerable expense (Hardy, 1995, estimates the cost of scoring essays to range 
from $3 to $6 per essay, using a holistic rubric and a rater scoring rate of 12 minutes 
per essay), knowledge maps can be scored via computer. 

The advantages of knowledge maps as assessment tools are numerous. The 
mechanics of constructing knowledge maps are easy, and students are quick to learn
how to construct both paper-and-pencil and computer-based knowledge maps. In 
addition, having been used successfully as a learning device, knowledge maps allow 
a link between instruction and assessment, boosting the face validity of knowledge 
mapping as an assessment tool. Further, due to recently developed computerized 
scoring solutions, scoring of knowledge maps is straightforward and cost-effective. 
Finally, and most importantly, we believe knowledge mapping allows us to evaluate 
deep conceptual understanding. However, although research studies have 
documented the benefits of knowledge mapping in instructional settings, issues of 
validity in assessment settings remain unanswered. 

Our hypothesis is that knowledge maps yield information about student 
competency that is different from, yet overlapping with, information revealed by 
essays and multiple-choice tasks. Although knowledge mapping may not be able to 
give us the kind of detailed and in-depth information about student understanding 
possible using an essay task, it seems nonetheless to allow us a glimpse into 
students' deep understanding in a domain, one that multiple-choice tasks may not
allow.

Our approach to investigating the validity of knowledge mapping as an 
assessment task is three-pronged. First, we outline a model for the creation of
knowledge mapping tasks, proposing a standard set of steps and using content area
and educational experts to ensure the content validity of the measures. Next, we 
describe the scoring method used to evaluate student performance, including a 
discussion of the method's reliability and its relationship to other possible scoring 
systems. Finally, we present our statistical results including comparative analyses, 
our multitrait-multimethod validity analyses, which involve two traits (students' 
understanding of hearing and of vision) and three different measurement methods 
(knowledge mapping, essay, and multiple-choice tasks), critical proposition 
analyses, and analyses of students' propositional elaborations. By addressing these 
three properties of knowledge maps — their creation, their scoring, and their 
relationship to other measures of understanding — we aim to gather evidence on the 
validity of knowledge maps for measuring students' scientific understanding. 

Development Model for Knowledge Mapping Tasks 

Our model for the creation of knowledge mapping tasks uses both content area 
experts (e.g., scientists, historians, mathematicians) and instructional experts (e.g., 
teachers, researchers) working together to devise a task to measure students' 
conceptual understandings. Underlying the mapping task is a scoring system that 
credits students for creating propositions that are like those of experts; students' 
maps are scored against multiple experts' maps rather than against one correct 
answer. 

The development of knowledge mapping assessment tasks involves five steps 
(see Table 1 for process summary). First, the topic area to be assessed is specified. 
Experts in the field (e.g., medical practitioners, historians, scientists) are asked to 
generate lists of the most important "big ideas" within the topic area. Previous 
expert-novice research has shown that experts generally organize their knowledge 
around certain key principles or important ideas, then link these ideas together in a 
principled manner (Chase & Simon, 1973; Chi, Feltovich, & Glaser, 1981; Chi, Glaser, 
& Farr, 1988). Thus, these concepts serve to anchor the knowledge map within a 
particular topic area. Next, supporting classroom material, textbooks, and teacher 
interviews are used to tailor the experts' lists of concepts to the particular student 
audience. For example, in this study, medical students first described the hearing 
process (using terms such as sound waves, ear drum, and pitch); then teachers adjusted 
the material to the appropriate grade level (deleting, for instance, the term pitch).

Table 1

Development Model for Knowledge Mapping Task

Step  Process

1     Specify topic area and ask experts to generate important concepts.

2     Review and tailor concept list to particular student audience.

3     Construct preliminary knowledge maps with concept list in order to
      generate set of linking words.

4     Review and tailor link list to particular student audience.

5     Construct final knowledge maps to be used for scoring purposes.



After identifying an appropriate set of important ideas or concepts, experts are 
asked to create preliminary knowledge maps using their own linking words. 
Concepts are connected with directional lines to create concept-link-concept sets, 
termed propositions,⁴ which act to form a sentence. For instance, one might create the
proposition nerve sends message to brain. Expert maps are compared, links are 
discussed with the teachers, and a final list of links appropriate to the students' level 
is generated from this information. For instance, in our hearing task, the list 
included links such as is part of, vibrates, and is connected to. 
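
The concept-link-concept structure described above can be sketched in code. The following is only an illustration, not the software used in the study: the Proposition type, the is_well_formed helper, and the contents of CONCEPTS and LINKS are assumptions made for the example (the terms are drawn from the hearing examples in the text, but the study's full final lists are not reproduced here).

```python
from typing import NamedTuple, Set


class Proposition(NamedTuple):
    """One directed concept-link-concept set (hypothetical representation)."""
    source: str
    link: str
    target: str

    def sentence(self) -> str:
        # A proposition reads as a short sentence, e.g. "nerve sends message to brain".
        return f"{self.source} {self.link} {self.target}"


# Illustrative lists only; assumed for this sketch, not the study's final lists.
CONCEPTS: Set[str] = {"sound waves", "ear drum", "nerve", "brain"}
LINKS: Set[str] = {"is part of", "vibrates", "is connected to", "sends message to"}


def is_well_formed(p: Proposition, concepts: Set[str], links: Set[str]) -> bool:
    """Check that a proposition uses only the agreed concepts and links."""
    return (
        p.source in concepts
        and p.target in concepts
        and p.link in links
        and p.source != p.target
    )


example = Proposition("nerve", "sends message to", "brain")
```

Restricting maps to a fixed concept list and a fixed link list is what makes a student map directly comparable with the expert maps built in the final step.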

In the last step of the process, experts again construct knowledge maps in the 
topic area, this time using the specified set of final concepts and links. These expert 
maps are later used to score student maps. 

Scoring of Knowledge Mapping Tasks 

The method of scoring student outcome maps is a crucial concern when using 
knowledge mapping in an assessment setting. Unlike traditional multiple-choice 
tests, there is clearly no one "correct" knowledge map for a given topic domain. A 
variety of student knowledge maps containing different sets of propositions all 
could score well in comparison to experts' maps. On the other hand, scoring of 
knowledge maps also differs significantly from scoring of other performance-based 
assessment tasks, such as the essay tasks used in this study. Whereas raters can be
trained to score explanation tasks holistically (using benchmark papers and an
explicit scoring rubric), cost factors and the complexity of a 10-term or 15-term
knowledge map with numerous links preclude this kind of evaluation approach at
any more than a rudimentary level (e.g., McClure et al., 1999). Regardless of the
scoring scenario, the validity and reliability of knowledge map scores are of utmost
importance.

4 We use the terms proposition and link at times interchangeably. Technically, a link is only the
connection between two concept words; in practice, we use the term link to describe both the link
between terms and the concept-link-concept proposition set. In context, it will be clear to which we
refer.


Research on knowledge mapping has often focused on hierarchical knowledge 
maps, a more restrictive type of map (see, for instance, Markham, Mintzes, & Jones, 
1994; Novak & Gowin, 1984; Wallace & Mintzes, 1990). Scoring methods for these 
types of maps generally include using the number of hierarchical and cross-links as 
measures of content knowledge. However, because much content knowledge need 
not be represented hierarchically (e.g., hearing and vision processes) and since less 
ordered methods of constructing knowledge maps allow for hierarchical structure, 
we do not find scoring schemes that expect hierarchically structured knowledge 
maps to be of much benefit in scoring less structured maps. Other research dealing 
with less structured knowledge maps (for example, Austin & Shore, 1995) has used 
more basic scoring techniques, often assigning scores based on the number of 
concepts and/or links, number of "good links," and so on (a somewhat limited and
often arbitrary system). Finally, some researchers have used expert maps as the
criteria for scoring students' maps (Herl, 1995; Herl et al., 1996; McClure et al., 1999;
Ruiz-Primo, Schultz, & Shavelson, 1997).

Using criterion maps has the advantage of allowing researchers to score 
student knowledge maps in various ways. For instance, rather than simply adding 
up terms and links, the degree to which a student's map matches an expert's map 
can vary by the definition of "match." Ruiz-Primo, Schultz, and Shavelson (1997) 
used three different map scores: (a) a total score, defined as the number of valid 
student links; (b) a congruence score, defined as the proportion of valid student 
links to all criterion links; and (c) a salience score, defined as the proportion of valid 
student links to all student links. Herl and his colleagues (1996) used a matching 
algorithm involving multiple expert maps in the scoring of each student map. Thus, 
matching entailed having the same proposition as any expert, and the degree of 
matching was weighted by the proportion of experts a student matched. Herl et al. 
calculated two related mapping scores: (a) a stringent semantic content score, based 
on exact link matches between student links and expert links; and (b) a categorized 
semantic content score, based on students matching some set of possible links (e.g., 
the causal set of links included contributed to, encourages, and led to links). 
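
The criterion-map scores described above can be made concrete with a short sketch. This is an illustration under stated assumptions, not the cited authors' implementations: the function names are hypothetical, a student link is assumed to be "valid" when the identical proposition appears in the criterion map, and the expert-weighted score is assumed to be the sum, over student propositions, of the proportion of experts who share each one. The concepts A, B, and C are placeholders.

```python
from typing import Dict, List, Optional, Set, Tuple

Proposition = Tuple[str, str, str]  # (concept, link, concept)


def match_scores(student: Set[Proposition], criterion: Set[Proposition]) -> Dict[str, float]:
    """Total, congruence, and salience scores against a single criterion map."""
    total = len(student & criterion)  # assumed definition of "valid" links
    return {
        "total": float(total),
        "congruence": total / len(criterion) if criterion else 0.0,
        "salience": total / len(student) if student else 0.0,
    }


def categorize(props: Set[Proposition], categories: Dict[str, str]) -> Set[Proposition]:
    """Replace each link with its category name; unlisted links are kept as-is."""
    return {(a, categories.get(link, link), b) for a, link, b in props}


def expert_weighted_score(
    student: Set[Proposition],
    experts: List[Set[Proposition]],
    categories: Optional[Dict[str, str]] = None,
) -> float:
    """Sum over student propositions of the proportion of experts matched.

    With categories=None the match is exact (stringent); with a link-to-category
    mapping, links in the same category count as matching (categorized).
    """
    if not experts:
        return 0.0
    if categories is not None:
        student = categorize(student, categories)
        experts = [categorize(e, categories) for e in experts]
    return float(sum(
        sum(1 for e in experts if prop in e) / len(experts)
        for prop in student
    ))


# The causal link set named in the text; the maps below are invented examples.
CAUSAL = {"contributed to": "causal", "encourages": "causal", "led to": "causal"}
student = {("A", "led to", "B"), ("B", "is part of", "C"), ("C", "vibrates", "A")}
expert_1 = {("A", "contributed to", "B"), ("B", "is part of", "C")}
expert_2 = {("A", "led to", "B")}

stringent = expert_weighted_score(student, [expert_1, expert_2])
categorized = expert_weighted_score(student, [expert_1, expert_2], CAUSAL)
```

In this invented example the categorized score is higher than the stringent score, because "led to" and "contributed to" fall in the same causal category and so the student's first proposition matches both experts rather than one.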